Skip to content

WPB-25915 add timeout and duration metric for conversation migration#5244

Merged
battermann merged 23 commits into
developfrom
WPB-25915-add-timeout-and-duration-metric-for-conversation-migration
Jun 11, 2026
Merged

WPB-25915 add timeout and duration metric for conversation migration#5244
battermann merged 23 commits into
developfrom
WPB-25915-add-timeout-and-duration-metric-for-conversation-migration

Conversation

@battermann

@battermann battermann commented May 28, 2026

Copy link
Copy Markdown
Contributor

https://wearezeta.atlassian.net/browse/WPB-25915

I have tested that it works. However, those test cannot be committed, because they hook into the production migration code to simulate the blocking conversation migration.

Checklist

  • Add a new entry in an appropriate subdirectory of changelog.d
  • Read and follow the PR guidelines

@zebot zebot added the ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist label May 28, 2026
@battermann battermann marked this pull request as ready for review May 28, 2026 07:27
@battermann battermann requested review from a team as code owners May 28, 2026 07:28
@battermann battermann requested a review from Copilot May 28, 2026 07:28

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds timeout handling and duration observability for background-worker conversation migrations to make stuck per-conversation attempts visible and fail-fast.

Changes:

  • Adds optional timeout to MigrationOptions and applies it to per-conversation migration attempts.
  • Registers and records a Prometheus histogram for conversation migration attempt durations by outcome.
  • Documents the new timeout setting and updates package dependencies for Timeout time-unit support.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
services/background-worker/src/Wire/PostgresMigrations.hs Registers and passes the new conversation migration duration histogram.
services/background-worker/src/Wire/BackgroundWorker.hs Updates existing MigrationOptions constructor calls for the new timeout field.
libs/wire-subsystems/src/Wire/Migration.hs Adds timeout configuration and timeout exception type.
libs/wire-subsystems/src/Wire/ConversationStore/Migration.hs Applies per-conversation timeout logic and records duration metrics.
libs/types-common/types-common.cabal Adds polysemy-time dependency.
libs/types-common/src/Util/Timeout.hs Derives TimeUnit for Timeout.
libs/types-common/default.nix Adds polysemy-time to Nix dependencies.
docs/src/developer/reference/config-options.md Documents the new migration timeout option.
changelog.d/5-internal/WPB-25915 Adds a changelog entry file, but it is currently empty.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docs/src/developer/reference/config-options.md Outdated
Comment thread docs/src/developer/reference/config-options.md Outdated
Comment thread docs/src/developer/reference/config-options.md Outdated
Comment thread docs/src/developer/reference/config-options.md Outdated
Comment thread docs/src/developer/reference/config-options.md Outdated
Comment thread docs/src/developer/reference/config-options.md Outdated
Comment thread libs/wire-subsystems/src/Wire/Migration.hs Outdated
Comment thread services/background-worker/src/Wire/BackgroundWorker.hs Outdated
Comment thread services/background-worker/src/Wire/PostgresMigrations.hs
Comment thread changelog.d/5-internal/WPB-25915

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 7 comments.

Comment thread services/background-worker/src/Wire/PostgresMigrations.hs Outdated
Comment thread services/background-worker/src/Wire/PostgresMigrations.hs Outdated
Comment thread services/background-worker/src/Wire/PostgresMigrations.hs Outdated
Comment thread services/background-worker/src/Wire/PostgresMigrations.hs Outdated
Comment thread libs/wire-subsystems/src/Wire/Migration.hs Outdated
Comment thread libs/wire-subsystems/src/Wire/CodeStore/Migration.hs Outdated
Comment thread changelog.d/5-internal/WPB-25915 Outdated
battermann and others added 4 commits June 5, 2026 10:00
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@battermann battermann force-pushed the WPB-25915-add-timeout-and-duration-metric-for-conversation-migration branch from 4c4c132 to cd421ac Compare June 5, 2026 10:31
@blackheaven

Copy link
Copy Markdown
Contributor

@battermann the CI was flaky and it lacks an approval from @akshaymankar

@akshaymankar akshaymankar left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Nits in comments.

Comment on lines +172 to +173
migrateConversationWithLock timeout_ cid =
withExclusiveMigrationLockAndTimeout timeout_ migDuration [cid] $ migrateConversation cid migCounter

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the only migration where we pulled out the withExclusiveMigrationLockAndTimeout outside the migrate<Thing> function. Why so?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the code again. It looks correct for all migrations. For the other migrations the code is inlined and for conversations it is extracted into a helper function. That is all. There is no reason.

Let me know if you want this changed, otherwise I will merge. 🙏

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good, I was just curious what happened. :shipit:

Comment thread libs/wire-subsystems/src/Wire/Migration.hs Outdated
blackheaven and others added 2 commits June 11, 2026 09:42
Co-authored-by: Akshay Mankar <akshay@wire.com>
@battermann battermann merged commit 07f3057 into develop Jun 11, 2026
11 checks passed
@battermann battermann deleted the WPB-25915-add-timeout-and-duration-metric-for-conversation-migration branch June 11, 2026 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants